1.
Progress in Biomedical Optics and Imaging - Proceedings of SPIE ; 12467, 2023.
Article in English | Scopus | ID: covidwho-20235035

ABSTRACT

MIDRC was created to facilitate machine learning research for tasks including early detection, diagnosis, prognosis, and assessment of treatment response related to the COVID-19 pandemic and beyond. The purpose of Technology Development Project (TDP) 3c is to create resources to assist researchers in evaluating the performance of their machine learning algorithms. An interactive decision tree has been developed, organized by the type of task that the machine learning algorithm is being trained to perform. The user can select information such as: (a) the type of task, (b) the nature of the reference standard, and (c) the type of the algorithm output. Based on their responses, users can obtain recommendations regarding appropriate performance evaluation approaches and metrics, including literature references, short video tutorials, and links to available software. Five tasks have been identified for the decision tree: (a) classification, (b) detection/localization, (c) segmentation, (d) time-to-event analysis, and (e) estimation. As an example, the classification branch of the decision tree includes binary and multi-class classification tasks and provides suggested methods and metrics, software recommendations, and literature references for situations where the algorithm produces either binary or non-binary (e.g., continuous) output and for reference standards with negligible or non-negligible variability and unreliability. The decision tree has been made publicly available on the MIDRC website to assist researchers in conducting task-specific performance evaluations, including classification, detection/localization, segmentation, estimation, and time-to-event tasks. © COPYRIGHT SPIE. Downloading of the abstract is permitted for personal use only.
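
The three selections described in the abstract (task, reference standard, output type) can be read as a lookup key. The following is only a minimal sketch of that idea: the `RECOMMENDATIONS` table, the `recommend` function, and the metric entries are illustrative placeholders chosen for this example, not content taken from the MIDRC decision tree.

```python
# Hypothetical lookup table keyed by (task, reference standard, output type).
# The entries are common textbook metrics used as placeholders; they are NOT
# copied from the MIDRC decision tree.
RECOMMENDATIONS = {
    ("classification", "negligible variability", "binary"):
        ["sensitivity", "specificity"],
    ("classification", "negligible variability", "continuous"):
        ["area under the ROC curve"],
    ("segmentation", "negligible variability", "binary"):
        ["Dice coefficient"],
}


def recommend(task, reference_standard, output_type):
    """Return the placeholder metric list for one combination of answers."""
    key = (task, reference_standard, output_type)
    # Unknown combinations return an empty list rather than raising.
    return RECOMMENDATIONS.get(key, [])


print(recommend("classification", "negligible variability", "continuous"))
```

A flat dictionary keeps the sketch short; a real tool would presumably also carry the literature references, tutorials, and software links that the abstract mentions.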

2.
Progress in Biomedical Optics and Imaging - Proceedings of SPIE ; 12465, 2023.
Article in English | Scopus | ID: covidwho-20233626

ABSTRACT

Assessing the generalizability of deep learning algorithms based on the size and diversity of the training data is not trivial. This study uses the mapping of samples in the image data space to the decision regions in the prediction space to understand how different subgroups in the data impact the neural network learning process and affect model generalizability. Using vicinal distribution-based linear interpolation, a plane of the decision region space spanned by a random 'triplet' of three images can be constructed. Analyzing these decision regions for many random triplets can provide insight into the relationships between distinct subgroups. In this study, a contrastive self-supervised approach is used to develop a 'base' classification model trained on a large chest x-ray (CXR) dataset. The base model is fine-tuned on COVID-19 CXR data to predict image acquisition technology (computed radiography (CR) or digital radiography (DX)) and patient sex (male (M) or female (F)). Decision region analysis shows that the model's image acquisition technology decision space is dominated by CR, regardless of the acquisition technology for the base images. Similarly, the Female class dominates the decision space. This study shows that decision region analysis has the potential to provide insights into subgroup diversity, sources of imbalances in the data, and model generalizability. © 2023 SPIE.
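
The plane spanned by an image triplet can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it assumes the plane is parametrized as x1 + a(x2 - x1) + b(x3 - x1), treats images as flattened lists of pixel values, and uses a toy brightness threshold in place of the fine-tuned network. The names `plane_point`, `decision_region_grid`, and `toy_classifier` are hypothetical.

```python
def plane_point(x1, x2, x3, a, b):
    """Return x1 + a*(x2 - x1) + b*(x3 - x1) for flattened images."""
    return [p1 + a * (p2 - p1) + b * (p3 - p1)
            for p1, p2, p3 in zip(x1, x2, x3)]


def decision_region_grid(classify, x1, x2, x3, steps=5):
    """Classify a steps-by-steps grid of points on the triplet plane."""
    grid = []
    for i in range(steps):
        row = []
        for j in range(steps):
            a = i / (steps - 1)  # coefficient along x2 - x1
            b = j / (steps - 1)  # coefficient along x3 - x1
            row.append(classify(plane_point(x1, x2, x3, a, b)))
        grid.append(row)
    return grid


def toy_classifier(image):
    """Stand-in for a trained model: thresholds the mean pixel value."""
    return "bright" if sum(image) / len(image) >= 0.5 else "dark"


# Three tiny 'images' of four pixels each (assumed example data).
x1 = [0.0, 0.0, 0.0, 0.0]
x2 = [1.0, 1.0, 1.0, 1.0]
x3 = [0.0, 0.0, 1.0, 1.0]

grid = decision_region_grid(toy_classifier, x1, x2, x3)
for row in grid:
    print(row)
```

Counting how often each label appears across grids built from many random triplets gives one rough way to see which class dominates the decision space.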

3.
Progress in Biomedical Optics and Imaging - Proceedings of SPIE ; 12469, 2023.
Article in English | Scopus | ID: covidwho-20233027

ABSTRACT

The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and create a publicly available data commons as well as a sequestered commons for performance evaluation of algorithms. This work sought to evaluate the currently implemented methodology for apportioning data to the public and sequestered data commons by investigating the resulting distributions of joint demographic characteristics between the public and sequestered commons. 54,185 patients whose de-identified imaging studies and metadata had been submitted to MIDRC were previously separated into public and sequestered commons using a multi-dimensional stratified sampling method, resulting in 41,556 patients (77%) in the public commons and 12,629 patients (23%) in the sequestered commons. To compare the balance obtained in the joint distributions of patient characteristics from use of the developed sequestration method, patients from each commons were separated into bins, representing a unique combination of the demographic variables of COVID-19 status, age, race, and sex assigned at birth. The joint distributions of patients were visualized, and the absolute and percent difference in each bin from an exact 77:23 split of the data were calculated. Results indicated 75.9% of bins obtained differences of less than 15 patients, with a median difference of 3.6 from the total data for both public and sequestered commons. Joint distributions of patient characteristics in both the public and sequestered commons closely matched each other as well as that of the total data, indicating the sequestration by stratified sampling method has operated as intended. © 2023 SPIE.
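
The per-bin comparison against an exact 77:23 split can be sketched as below. The abstract does not give the exact formula, so this sketch assumes the difference is the observed public count minus the count expected under the target fraction, and that the percent difference is taken relative to that expected count. The function name `bin_differences` and the example records are hypothetical.

```python
from collections import Counter


def bin_differences(public, sequestered, variables, public_fraction=0.77):
    """Per-bin (absolute, percent) difference from the target public share."""
    def bin_key(patient):
        return tuple(patient[v] for v in variables)

    public_counts = Counter(bin_key(p) for p in public)
    sequestered_counts = Counter(bin_key(p) for p in sequestered)

    results = {}
    for key in set(public_counts) | set(sequestered_counts):
        total = public_counts[key] + sequestered_counts[key]
        expected_public = public_fraction * total
        # Assumed definition: observed minus expected public count.
        abs_diff = public_counts[key] - expected_public
        pct_diff = 100.0 * abs_diff / expected_public
        results[key] = (abs_diff, pct_diff)
    return results


# Assumed toy data: one bin split 80:20, one bin split exactly 77:23.
public = [{"sex": "F"}] * 80 + [{"sex": "M"}] * 77
sequestered = [{"sex": "F"}] * 20 + [{"sex": "M"}] * 23

diffs = bin_differences(public, sequestered, ["sex"])
for key in sorted(diffs):
    print(key, diffs[key])
```

Passing several variable names (for example COVID-19 status, age, race, and sex) makes each bin a unique combination of values, as in the joint distributions described above.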

4.
Medical Imaging 2022: Image Perception, Observer Performance, and Technology Assessment ; 12035, 2022.
Article in English | Scopus | ID: covidwho-1901882

ABSTRACT

The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and create a publicly available image repository/commons as well as a sequestered database for performance evaluation and benchmarking of algorithms. After de-identification, approximately 80% of the medical images and associated meta-data will become part of the open repository and 20% will be sequestered and kept separate from the open commons. To ensure that both the public, open dataset and the sequestered dataset are representative of the population available, demographic characteristics across the two datasets must be balanced. Our method uses multidimensional stratified sampling where several demographic variables of interest are sequentially used to separate the data into individual strata, each representing a unique combination of variables. Within each stratum, patients are randomly assigned to the open set (80%) or the sequestered set (20%). Thus, for p variables of interest, the balance of the p-dimensional distribution of variable combinations can be controlled. This algorithm was used on an example COVID-19 dataset containing image exams of 4662 patients using the variables of race, age, sex at birth, and ethnicity, containing 8, 8, 2, and 4 categories, respectively. After stratification of this dataset into the two subsets, the resulting distribution of each variable matched the distribution from the original dataset with a maximum percent difference from its original fraction of 0.4%. These results demonstrate that the implemented process of multi-dimensional sequential stratified sampling can partition a large database while maintaining balance across several variables. © 2022 SPIE. All rights reserved.
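
The stratify-then-assign procedure described above can be sketched in a few lines. This is a minimal illustration, not the MIDRC implementation: it assumes each stratum is shuffled and then cut at a rounded 80% mark, whereas the abstract only states that patients are randomly assigned within each stratum. The function name `stratified_split`, the record fields, and the example data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(patients, variables, open_fraction=0.8, seed=0):
    """Split patient records into open and sequestered sets per stratum."""
    rng = random.Random(seed)

    # One stratum per unique combination of the chosen variables.
    strata = defaultdict(list)
    for patient in patients:
        key = tuple(patient[v] for v in variables)
        strata[key].append(patient)

    open_set, sequestered_set = [], []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        # Assumed rule: cut each shuffled stratum at the rounded target size.
        n_open = round(open_fraction * len(members))
        open_set.extend(members[:n_open])
        sequestered_set.extend(members[n_open:])
    return open_set, sequestered_set


# Assumed toy data: 1000 patients in 10 equally sized strata.
patients = [
    {"id": i, "sex": "F" if i % 2 else "M", "age_group": str(i % 5)}
    for i in range(1000)
]
open_set, sequestered_set = stratified_split(patients, ["sex", "age_group"])
print(len(open_set), len(sequestered_set))
```

Very small strata cannot hit 80:20 exactly, which is one reason a check of the resulting per-variable distributions, as reported in the abstract, is still needed after the split.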
